Photo-Realistic Mouth Animation Based on an Asynchronous Articulatory DBN Model for Continuous Speech

Authors

  • He Zhang
  • Dongmei Jiang
  • Peng Wu
  • Hichem Sahli
Abstract

This paper proposes a continuous-speech-driven, photo-realistic visual speech synthesis approach based on an articulatory dynamic Bayesian network model with constrained asynchrony (AF_AVDBN). To train the AF_AVDBN model, perceptual linear prediction (PLP) features and YUV image features are extracted as the acoustic and visual features, respectively. Given an input speech signal and the trained AF_AVDBN parameters, an EM-based algorithm is derived to infer the optimal YUV features, which are then combined with compensated high-frequency components to synthesize the mouth animation corresponding to the input speech. In the experiments, mouth animations are synthesized for 80 connected-digit speech sentences. Both qualitative and quantitative evaluations show that the proposed method synthesizes more natural, clear, and accurate mouth animations than the state-asynchronous DBN model (S_A_DBN).
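The synthesis step described above maps the inferred YUV features back to image space and restores high-frequency detail before rendering each mouth frame. The paper's exact compensation scheme is not given in the abstract; as a rough illustration only, the sketch below uses a standard BT.601 YUV-to-RGB conversion and a simple unsharp mask as a hypothetical stand-in for the high-frequency compensation:

```python
import numpy as np

def yuv_to_rgb(yuv):
    """Convert an H x W x 3 YUV image (BT.601, U/V centered at 0, values in [0, 1]) to RGB."""
    y, u, v = yuv[..., 0], yuv[..., 1], yuv[..., 2]
    r = y + 1.402 * v
    g = y - 0.344136 * u - 0.714136 * v
    b = y + 1.772 * u
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 1.0)

def compensate_high_freq(frame, alpha=0.8):
    """Restore high-frequency detail with an unsharp mask.
    NOTE: a hypothetical stand-in for the paper's compensation step, not its actual method."""
    # 3x3 box blur as a crude low-pass filter
    h, w = frame.shape[:2]
    pad = np.pad(frame, ((1, 1), (1, 1), (0, 0)), mode="edge")
    low = np.zeros_like(frame)
    for dy in range(3):
        for dx in range(3):
            low += pad[dy:dy + h, dx:dx + w]
    low /= 9.0
    detail = frame - low  # high-frequency residual removed by the blur
    return np.clip(frame + alpha * detail, 0.0, 1.0)

# Tiny demo: a 4x4 "mouth patch" with a luminance ramp, converted then sharpened
frame_yuv = np.zeros((4, 4, 3))
frame_yuv[..., 0] = np.linspace(0.2, 0.8, 16).reshape(4, 4)
rgb = compensate_high_freq(yuv_to_rgb(frame_yuv))
print(rgb.shape)  # (4, 4, 3)
```

In a full pipeline, one such frame would be produced per inferred YUV feature vector and the frames concatenated at the video frame rate to form the animation.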


Related articles

Photo-realistic visual speech synthesis based on AAM features and an articulatory DBN model with constrained asynchrony

This paper presents a photo-realistic visual speech synthesis method based on an audio-visual articulatory dynamic Bayesian network model (AF_AVDBN), in which the maximum asynchronies between articulatory features, such as lips, tongue, and glottis/velum, can be controlled. Perceptual linear prediction (PLP) features from the audio speech and active appearance model (AAM) features from mouth ...


Strategies and results for the evaluation of the naturalness of the LIPPS facial animation system

The paper describes the strategy and results of an evaluation of the naturalness of a facial animation system, conducted with the help of hearing-impaired persons. It shows perspectives for improvement of the facial animation, independent of the animation model itself. The fundamental thesis of the evaluation is that the comparison of presented and perceived visual information has to be performed on ba...


A coupled HMM approach to video-realistic speech animation

We propose a coupled hidden Markov model (CHMM) approach to video-realistic speech animation, which realizes realistic facial animations driven by speaker-independent continuous speech. Different from hidden Markov model (HMM)-based animation approaches that use a single-state chain, we use CHMMs to explicitly model the subtle characteristics of audio–visual speech, e.g., the asynchrony, tempora...


Artimate: an articulatory animation framework for audiovisual speech synthesis

We present a modular framework for articulatory animation synthesis using speech motion capture data obtained with electromagnetic articulography (EMA). Adapting a skeletal animation approach, the articulatory motion data is applied to a three-dimensional (3D) model of the vocal tract, creating a portable resource that can be integrated in an audiovisual (AV) speech synthesis platform to provide...


Viseme-aware Realistic 3D Face Modeling from Range Images

In this paper, we propose an example-based realistic face modeling method with viseme control. A viseme describes the particular facial and oral positions and movements that occur alongside the voicing of phonemes. In facial animation applications such as speech animation and talking heads, a face model with an open mouth is often used, and the model is animated along with the speech sound by synchronizin...



Journal:

Volume   Issue 

Pages  -

Publication date: 2011